Machine learning models to predict traumatic brain injury outcomes in Tanzania: Using delays to emergency care as predictors

Constraints on emergency department resources may prevent the timely provision of care following a patient's arrival at the hospital. In-hospital delays may adversely affect health outcomes, particularly among trauma patients who require prompt management. Prognostic models can help optimize resource allocation, thereby reducing in-hospital delays and improving trauma outcomes. The objective of this study was to investigate the predictive value of delays to emergency care in machine learning based traumatic brain injury (TBI) prognostic models. Our data source was a TBI registry from Kilimanjaro Christian Medical Centre Emergency Department in Moshi, Tanzania. We created twelve unique variables representing delays to emergency care and included them in eight different machine learning based TBI prognostic models that predict in-hospital outcome. Model performance was compared using the area under the receiver operating characteristic curve (AUC). Inclusion of our twelve time to care variables improved predictive performance in each of our eight prognostic models. Our Bayesian generalized linear model produced the largest AUC, with a value of 89.5 (95% CI: 88.8, 90.3). Time to care variables were among the most important predictors of in-hospital outcome in our three best performing models. In low-resource settings where delays to care are highly prevalent and contribute to high mortality rates, incorporation of care delays into prediction models that support clinical decision making may benefit both emergency medicine physicians and trauma patients by improving prognostication performance.


Introduction
Traumatic brain injury (TBI) is a leading cause of trauma related death and disability worldwide, affecting an estimated 69 million people annually [1]. The global burden of TBI is disproportionately endured by low- and middle-income countries (LMICs), which account for 72.0% of all TBI cases and over 90.0% of all trauma related deaths [1,2]. Among LMICs, estimates suggest the incidence of TBI is largest in sub-Saharan Africa [3]. Moreover, in comparison to high-income countries (HICs), TBI patients in the sub-Saharan region experience higher rates of mortality [4,5].
High mortality rates among TBI patients in sub-Saharan LMICs may be partially explained using the three-delay model. The three-delay model was originally developed as a framework for assessing high maternal mortality rates in low-income settings [6]. More recently, the model has been used to understand poor outcomes among emergency care recipients [7]. The model focuses on three main causes of poor emergency patient outcomes: delays in deciding to seek care, delays in reaching a health facility, and delays in receiving adequate and appropriate treatment [7].
In the sub-Saharan region, all three delays affect trauma patients. Delays in seeking care are largely dependent on an individual's perceived need for care and their ability to access effective and affordable treatment, both of which can be influenced by cultural, geographical, and socioeconomic factors. Delays in reaching a health facility can be reduced through the use of national prehospital emergency care systems. However, less than 9.0% of Africa's population has access to prehospital emergency care services [8]. Lastly, delays in receiving appropriate treatment depend on an emergency department's (ED's) availability of medical professionals and equipment. In sub-Saharan LMICs, in-hospital treatment delays have been associated with increased TBI patient mortality [9-12]. ED healthcare providers can play a critical role in reducing all three delays through research, policy, and advocacy.
From the perspective of an ED healthcare provider, and in a setting of limited resources, the first step in reducing treatment delays is optimizing the allocation of resources to patients. Prognostic models are an innovative solution that can help optimize clinical decision making to maximize positive patient outcomes. In the field of medicine, a prognostic model uses patient information to estimate the probability of a clinical outcome, thereby supporting clinical decision making [13-16]. A TBI specific prognostic model would allow point of care healthcare providers to readily assess a TBI patient's prognosis at any point during the patient's hospital stay, using the patient's most up to date clinical information. Consequently, in health systems lacking trained and specialized professionals, a prognostic model could bridge gaps in clinical decision making.
Our team previously constructed and internally validated multiple machine learning based TBI prognostic models using patient data from a tertiary referral hospital in Tanzania [17]. However, these prognostic models do not include time delays that represent the three-delay model described above. The objective of this study was to create variables representing patient delays to emergency care, incorporate these variables as predictors in our baseline prognostic models, and assess how the inclusion of these predictors impacts the performance of each model. To our knowledge, no TBI prognostic model to date has used delays to emergency care as a predictor of patient prognosis.

Study design and setting
This study is an analysis of clinical data from Kilimanjaro Christian Medical Centre (KCMC) in Moshi, Tanzania. KCMC is a tertiary referral hospital serving a population of over 15 million people [18]. Approximately 1,000 TBI patients present to the KCMC ED each year, one third of whom are admitted to the intensive care unit (ICU) [12].

Study participants
This study uses a registry of 3,209 TBI patients admitted to the KCMC ED. The registry was collected prospectively from 2013 to 2017 and includes information on demographics, vital signs, injury characteristics, time to care, care received, and outcomes [12]. Patients were included in the registry if they presented with acute TBI (less than 24 hours since injury occurrence) of any severity and were evaluated by a physician in the ED. TBI patients who did not survive long enough to be evaluated by a physician in the ED, who presented for follow-up, or who presented with non-acute TBI were not included in the registry.

Study variables
Each of the models assessed in this study included the following variables as predictors: age, sex, mechanism of injury, intention of injury, day of injury alcohol use, temperature, respiratory rate, heart rate, systolic and diastolic blood pressure, pulse oxygen, pupil reactivity, Glasgow Coma Score (GCS), ED disposition, and twelve time to care variables.
The twelve time to care variables included time from (1) injury occurrence to hospital arrival, (2) hospital arrival to physician arrival, (3) physician arrival to lab tests ordered, (4) physician arrival to chest x-ray, (5) physician arrival to skull x-ray, (6) physician arrival to brain CT scan, (7) physician arrival to administration of fluids, (8) physician arrival to administration of oxygen, (9) physician arrival to surgeon arrival, (10) physician arrival to TBI surgery, (11) physician arrival to non-TBI surgery, and (12) physician arrival to intensive care unit (ICU) admission. Time stamps for each event were recorded and entered into the TBI registry upon patient encounter/event occurrence. Our time to care variables were then calculated in the background of our registry as the difference between time stamps for different events.
The outcome predicted by our models was a patient's Glasgow Outcome Score (GOS) dichotomized as good (4-5) or poor (1-3). The Glasgow Outcome Scale is a validated measure used to assess recovery among trauma and head injury patients [19]. The score ranges from one to five with the following categories and is typically assigned at hospital discharge: (1) death, (2) persistent vegetative state, (3) severe disability, (4) moderate disability, and (5) low disability. Low disability is defined as a return to normal life with no more than minor cognitive deficits. GOS was calculated at hospital discharge for all patients, except those who died during hospitalization. We treated GOS as dichotomous, rather than continuous, because few patients in our sample had moderate GOS scores.
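The dichotomization described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' code (the study reports using R); the function name `dichotomize_gos` and the label strings are assumptions introduced here.

```python
# Illustrative sketch only; names and label strings are assumptions.
# The five categories follow the Glasgow Outcome Scale as listed in the text.
GOS_LABELS = {
    1: "death",
    2: "persistent vegetative state",
    3: "severe disability",
    4: "moderate disability",
    5: "low disability",
}


def dichotomize_gos(score: int) -> str:
    """Map a GOS of 1-3 to 'poor' and a GOS of 4-5 to 'good'."""
    if score not in GOS_LABELS:
        raise ValueError(f"GOS must be an integer from 1 to 5, got {score!r}")
    return "good" if score >= 4 else "poor"
```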

Categories for time to care variables
The South African Triage Scale (SATS) is a tool used to assist patient triage in low-income EDs, and which has been validated in clinical settings across numerous LMICs [20-27]. The SATS defines standards for the timely care of patients presenting to EDs. Using these standards as a reference, we considered the following categories for our twelve time to care variables: 1.0 hours or less, 1.1-4.0 hours, 4.1-12.0 hours, and 12.1 hours or greater.
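As a minimal sketch of how a delay could be derived from two time stamps and assigned to one of the four categories above, consider the following Python example. It is not the registry's implementation: the function names are hypothetical, and two choices are assumptions made here, namely rounding to one decimal place before binning (because the categories are stated to one decimal) and rejecting negative delays as data errors.

```python
from datetime import datetime


def delay_hours(start: datetime, end: datetime) -> float:
    """Delay between two recorded events, in hours."""
    return (end - start).total_seconds() / 3600.0


def delay_category(hours: float) -> str:
    """Assign a delay (in hours) to one of the four categories in the text."""
    if hours < 0:
        # Assumption: an event recorded before its reference event is a data error.
        raise ValueError("delay cannot be negative")
    h = round(hours, 1)  # assumption: bin on the one-decimal value
    if h <= 1.0:
        return "1.0 hours or less"
    if h <= 4.0:
        return "1.1-4.0 hours"
    if h <= 12.0:
        return "4.1-12.0 hours"
    return "12.1 hours or greater"
```

For example, a brain CT scan performed two and a half hours after physician arrival would fall in the "1.1-4.0 hours" category.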

Additional categories for time to care variables
Logically, patients in our registry who did not receive a certain procedure (e.g. x-ray, CT scan, oxygen, surgery, etc.) also did not have a recorded delay to receiving that procedure. For all such patients in our registry, we attempted to evaluate their need for any procedure they did not receive. Our evaluation of a patient's need for a procedure was based on their vital sign information upon hospital presentation. We dichotomized patient need for a procedure as "needed" or "not needed."

PLOS GLOBAL PUBLIC HEALTH
Needing fluids was defined as being hypotensive (<100 mmHg systolic blood pressure). Needing oxygen was defined as being hypoxic (<92% pulse oxygen) or having a GCS of 8 or less. Needing a brain CT scan was defined as having a GCS of 13 or less [28-31]. Our registry lacked sufficient data to evaluate a patient's need for any other procedures. For example, with the data available in our registry we had no way of evaluating a patient's need for surgery or ICU admission.
Our evaluation of patient need allowed us to add two additional categories to three variables: time to fluids, time to oxygen, and time to brain CT scan. These two additional categories were "patient needed but did not receive the procedure" and "patient did not need and did not receive the procedure." We also added one additional category to three other variables: time to TBI surgery, time to non-TBI surgery, and time to ICU admission. This additional category was "patient did not receive the procedure."
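The need definitions and the two additional categories can be expressed as simple rules. The Python sketch below uses the thresholds stated above; the function names, argument names, and category strings are assumptions for illustration and do not reflect how the registry stores these values.

```python
from typing import Optional


def needs_fluids(systolic_bp: float) -> bool:
    """Hypotension: systolic blood pressure below 100 mmHg."""
    return systolic_bp < 100


def needs_oxygen(pulse_oxygen: float, gcs: int) -> bool:
    """Hypoxia (pulse oxygen below 92%) or a GCS of 8 or less."""
    return pulse_oxygen < 92 or gcs <= 8


def needs_brain_ct(gcs: int) -> bool:
    """A GCS of 13 or less."""
    return gcs <= 13


def care_category(delay_bin: Optional[str], needed: bool) -> str:
    """Return the delay category if the procedure was received,
    otherwise one of the two 'not received' categories."""
    if delay_bin is not None:  # procedure received, so a delay was recorded
        return delay_bin
    return "not received, needed" if needed else "not received, not needed"
```

Under these rules, a patient with a GCS of 7 who never received a brain CT scan would be coded as "not received, needed" for the time to brain CT scan variable.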

Data pre-processing
All data processing was performed using the statistical software R. Variables with more than 20% of observations missing were removed from our analysis. Ten iterations of multiple imputation by chained equations, using the mice package in R software, were used to impute missing data for all remaining variables [32]. After imputation and conversion to indicator variables, highly correlated variables (correlation coefficient ≥ 0.9 or ≤ -0.9) as well as those with near-zero variance were removed. Our final analysis included 48 predictors and 3,140 patients.
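Two of the filtering steps above (the 20% missingness rule and the correlation cutoff of 0.9) can be sketched as follows. This Python example is a simplified illustration, not the R code used in the study: imputation and the near-zero variance filter are omitted, the function names are hypothetical, and dropping the later variable of each highly correlated pair is an assumption made for simplicity.

```python
from math import sqrt


def drop_sparse(columns, max_missing=0.20):
    """Keep columns whose share of missing (None) values is at most max_missing."""
    kept = {}
    for name, values in columns.items():
        share_missing = sum(v is None for v in values) / len(values)
        if share_missing <= max_missing:
            kept[name] = values
    return kept


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def drop_correlated(columns, cutoff=0.9):
    """Keep a column only if |r| < cutoff against every column already kept.

    Assumption: of two highly correlated columns, the later one is dropped.
    Intended for complete (already imputed) numeric columns.
    """
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < cutoff for other in kept.values()):
            kept[name] = values
    return kept
```

In this sketch the missingness filter would run before imputation and the correlation filter after it, mirroring the order described in the text.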

Predictive modeling
We produced eight different predictive models using eight different machine learning algorithms: Artificial Neural Network, Bagged Tree, Bayesian Generalized Linear Model, Gradient Boosting Machine, K-Nearest Neighbor, Random Forest, Ridge Regression, and Single C5.0 Ruleset. To train and internally validate each model, we used repeated cross validation with five repetitions of ten-fold partitioning. The measure used to define the best performing model was the area under the receiver operating characteristic (ROC) curve (AUC). However, we also report accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) using a standard classification threshold of 0.50 for all models. To compare predictor importance across each of our eight models, we used the varImp function in R software's caret package [33]. The varImp function measures variable importance differently for each model type. However, to allow comparison between model types, the variable importance measures are scaled to have a minimum and maximum value of 0 and 100, respectively. Thus, variables with values of 100 are most important to a model's prediction ability while variables with values of 0 are not at all important.
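The evaluation quantities named above can be illustrated with a short, self-contained Python sketch. It is not the caret workflow used in the study. All function names are hypothetical, the fold generator is unstratified (an assumption), the AUC is computed from its rank-based definition, predictions at or above the 0.50 threshold are treated as positive (an assumption), and the importance rescaling is a plain min-max transformation to the 0 to 100 range.

```python
import random


def repeated_kfold(n, k=10, repeats=5, seed=0):
    """Yield (train, test) index lists for `repeats` rounds of k-fold partitioning."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        for fold in range(k):
            test = order[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


def auc(scores, labels):
    """Probability that a random positive case scores above a random negative
    case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num, den):
    return num / den if den else float("nan")


def threshold_metrics(scores, labels, threshold=0.50):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a fixed threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def scale_importance(raw):
    """Min-max rescale raw importance values to the range 0 to 100."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        return {name: 0.0 for name in raw}
    return {name: 100.0 * (v - lo) / (hi - lo) for name, v in raw.items()}
```

With the defaults, `repeated_kfold` produces 50 train/test splits (five repetitions of ten folds), and a model's cross-validated AUC would be summarized across those splits.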

Ethics statement
This study was approved by Duke University's institutional review board, Tanzania's National Institute of Medical Research, and KCMC's ethics committee. Participant consent was waived as this study is a secondary analysis of a de-identified clinical registry.

Sociodemographic, injury, and clinical characteristics
Overall, 2,786 (88.7%) patients experienced a good outcome and 354 (11.3%) patients experienced a poor outcome. The mean age of our sample was 30.8 years and 34.2 years for the good and poor outcome groups, respectively. Patients in the good and poor outcome groups were predominately male (82.0%, 84.2%) and most sustained injury through a motorcycle related road traffic incident (50.1%, 50.0%). A minority of patients in the good and poor outcome groups reported day of injury alcohol use (25.7%, 23.7%). From a clinical perspective, our outcome groups differed significantly with regard to one vital sign. Patients in the good outcome group had a mean pulse oxygen of 96.4. In contrast, patients in the poor outcome group had a mean pulse oxygen of 89.1. The mean GCS score in the good and poor outcome groups was 13.9 and 8.1, respectively (Table 1).

Time to care characteristics
In both the good and poor outcome groups the most common time to ED arrival was more than 12.0 hours after injury occurrence (31.6%, 33.3%). The most common wait time from ED arrival to physician evaluation for both the good and poor outcome groups was 0.0 to 1.0 hours (94.5%, 96.9%). Following physician evaluation, a majority of good and poor outcome patients waited 0.0 to 1.0 hours for lab tests (47.1%, 60.7%), 0.0 to 1.0 hours for a chest x-ray (48.4%, 51.4%), and 0.0 to 1.0 hours for a skull x-ray (48.6%, 54.5%). Most good outcome patients did not receive and did not need a brain CT scan (78.2%) whereas most poor outcome patients needed, but did not receive a brain CT scan (71.2%). A majority of good and poor outcome patients did not receive and did not need fluids (64.5%, 44.9%), and did not receive and did not need oxygen (94.4%, 54.8%). Most patients in both the good and poor outcome groups did not receive a TBI surgery (82.9%, 78.0%), a non-TBI surgery (90.6%, 92.1%), or ICU admission (93.9%, 70.1%) (Table 2).

Model performance
We constructed eight different machine learning based prognostic models. The AUC of each model is depicted in Fig 1. Our Bayesian generalized linear model produced the largest AUC, with a value of 89.5 (95% CI: 88.8, 90.3). Our Bayesian model had an accuracy of 0.871, a sensitivity of 0.892, a specificity of 0.748, a PPV of 0.955, and an NPV of 0.533.
We compared the AUC of our eight models to the AUC of our eight models without our time to care variables. For each of our eight models, the AUC increased with the inclusion of our time to care variables (Table 3). The largest increase in AUC occurred in our K-nearest neighbors model, which gained 5.1 percentage points. The smallest increase in AUC occurred in our Single C5.0 Ruleset model, which gained 0.8 percentage points. Our best performing model, the Bayesian generalized linear model, gained 3.0 percentage points in AUC with the inclusion of our time to care variables.
We also compared the accuracy, sensitivity, specificity, PPV, and NPV of our eight models with and without the time to care variables (Table 4). With the exception of the K-nearest neighbors model, most performance metrics increased with the addition of time to care variables. However, gains were minimal and often no more than three percentage points.
Lastly, we compared predictor importance across our top three performing models (Fig 2). In our Bayesian generalized linear model and ridge regression model, eight of the top twelve predictors were time to care variables. In our third best performing model, three of the top five predictors were time to care variables. Across these models, time to brain CT scan, time to oxygen, time to ICU admission, and time to fluids appear to be the most important time to care variables.

Discussion
To our knowledge, our prediction models are the first to incorporate time to emergency care as a predictor of TBI patient outcomes in a low-income setting. Our time to care variables encompass both receipt of hospital care (whether or not a patient received a specific diagnostic or treatment procedure) and delays to hospital care (how long a patient waited before receiving a specific diagnostic or treatment procedure). The value of time to care variables lies not only in their potential to improve model performance, but also in their potential to improve translation of machine learning based prediction models into clinical decision support tools. Based on the results presented in this study, we suggest the following: inclusion of time to care predictors (1) increases the clinical relevance of TBI prediction models, (2) improves the usefulness of TBI prediction models as clinical decision support tools, and (3) improves the usability of TBI prediction models as clinical decision support tools.

Clinical relevance and time to care
In accordance with the Prognosis Research Series on prognostic model research, we updated our prediction models to include twelve new time to care variables as predictors [34]. In addition, we compared the performance of our baseline and updated models using discrimination metrics [35]. Our results provide two important points regarding the clinical relevance of time to care in TBI prediction models. First, time to care variables improve model performance. Inclusion of our twelve time to care variables improved prediction performance in each of our models as indicated by an increase in AUC (Table 3) and other performance metrics (Table 4). Our updated models not only outperformed our baseline models, but also performed similarly to other published TBI prediction models. In a systematic review of 102 TBI prediction models, the highest reported AUC was 89.0 [36]. In a similar review, only two of eleven identified TBI prediction models reported AUCs greater than 80.0 [37]. An additional review of neurosurgical prediction models identified only one machine learning based TBI prognostic model, and it achieved an AUC of 89.0 [38]. With an AUC of 89.5, our best performing model exemplifies the value that time to care variables may add to a prognostic model's ability to make TBI outcome predictions.
Second, time to care variables take priority over other model predictors. In our top three performing updated models, time to care variables encompassed many of each model's most important predictors (Fig 2). In other words, time to care variables contributed to model performance more substantially than other clinical and sociodemographic predictors. Notably, the categories "not received, needed" and "not received, not needed" within our time to care variables were often ranked most important to model performance. This suggests that patient need for a specific procedure may be a more powerful predictor of TBI outcome than the length of time a patient spends waiting to receive that procedure after being assessed by a physician. Nonetheless, to support TBI prognostication, ED healthcare providers may benefit from prioritizing the collection of time to care information over other clinical and sociodemographic information when faced with limited resources that may prevent the collection of other more clinically complex indicators that predict TBI outcomes.

Usefulness and time to care
Clinical prediction models typically provide a patient's risk of some outcome of interest. Our models, for example, predict the risk of a poor TBI outcome. Moreover, the inclusion of time to care variables in our models improves prediction performance, which suggests that future efforts to build tools for TBI prognosis may benefit from taking into account information regarding delays to different forms of emergency care. However, physicians agree that knowing just the risk of an outcome is of minimal clinical use [39]. For a prediction model to be more useful in a clinical setting, it should inform whether or not to provide a specific treatment or diagnostic procedure [40]. Our time to care variables include categories that indicate whether or not a patient received lab tests, a chest x-ray, a skull x-ray, a brain CT scan, fluids, oxygen, TBI surgery, non-TBI surgery, and ICU admission. Thus, our prediction models could, theoretically, inform the provision of these procedures. For example, our models could be structured to give a patient's risk of a poor in-hospital outcome under the assumption that the patient receives fluids and the assumption that the patient does not receive fluids. An ED healthcare provider can then compare the risk of a poor in-hospital outcome in each scenario to decide whether or not the patient would benefit from fluids. Few TBI models to date can inform the provision of specific procedures. In Perel et al.'s systematic review, less than 20% of identified TBI prediction models included treatment and diagnostic predictors [36]. Ultimately, our time to care variables could increase the usefulness of TBI prediction models to clinical decision makers not only by improving overall TBI prognosis, but also by informing the provision of specific treatment and diagnostic procedures. However, it must be reiterated that our time to care variables cannot estimate the causal effect of receiving vs. not receiving a specific procedure. These variables simply allow a prognostic model to predict a patient's outcome with and without a specific procedure.
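The scenario comparison described above (predicted risk with and without a procedure) can be sketched with a toy logistic model in Python. Everything in this example is hypothetical: the intercept, the coefficients, and the feature names are invented for illustration and are not estimates from the study, and the two predicted risks are predictive contrasts only, not causal effects.

```python
from math import exp

# Hypothetical values for illustration only; NOT estimates from the study.
INTERCEPT = -2.0
COEFS = {
    "gcs_8_or_less": 1.5,
    "hypotensive": 0.8,
    "fluids_not_received_needed": 0.9,
}


def risk_poor_outcome(features):
    """Predicted probability of a poor outcome from a toy logistic model."""
    z = INTERCEPT + sum(COEFS[name] * float(value) for name, value in features.items())
    return 1.0 / (1.0 + exp(-z))


def compare_scenarios(patient, indicator):
    """Predicted risk with the indicator switched off and switched on.

    Returns (risk_if_off, risk_if_on). These are model predictions under two
    input scenarios, not an estimate of the procedure's causal effect.
    """
    risk_off = risk_poor_outcome(dict(patient, **{indicator: 0}))
    risk_on = risk_poor_outcome(dict(patient, **{indicator: 1}))
    return risk_off, risk_on
```

A provider-facing tool built along these lines would display both predicted risks side by side, for instance for a hypotensive patient with and without the "fluids not received, needed" indicator set.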

Usability and time to care
A usable prediction model allows the user to obtain the output with ease and efficiency. In a survey of 137 physicians in the United States, two of the most cited limitations to using prognostic tools in clinical practice were poor accessibility and a time consuming and cumbersome process [41]. Moreover, prognostic research in general emphasizes the importance of using clearly defined predictors that can be easily measured so as to maximize usability [42]. While trauma related time stamps are not always easy to record, especially in LMIC settings with limited resources, our time to care predictors are both unambiguous and less complicated to use in comparison to other clinical variables. With any of our time to care predictors the user must only enter whether or not a patient received a procedure and, if so, how long the patient waited to receive the procedure since being evaluated by a physician. Furthermore, time to care information is simple to collect relative to other clinical variables. Receipt of care only requires a record of whether or not a patient received a procedure. A delay to care only requires a record of the time at which an event occurred. Consequently, time to care information is accessible in any context, making it a potentially valuable data source. In addition, if collecting information on all twelve time to care variables proved to be cumbersome, users could also prioritize those variables which we show to be most important to model prediction (e.g., time to brain CT scan, time to oxygen, time to ICU admission, and time to fluids).

TBI outcomes and time to care
The first sixty minutes following sustained trauma has been termed the "golden hour" among emergency medicine providers [43]. The "golden hour" represents a window of time after which the probability of mortality significantly increases in the absence of definitive trauma care [44]. Although the concept of a "golden hour" has been widely promoted, there is little evidence to suggest trauma morbidity and mortality significantly increase following sixty minutes without management or treatment [44,45]. Regardless of any evidence that undermines or supports the notion of a "golden hour," it is generally accepted that trauma patients should receive care as soon as possible [46]. The positive impact of our time to care variables on the performance of our prediction models suggests an inverse association between length of in-hospital emergency care delays and good TBI outcomes. Our results therefore provide further support to the notion that delays to appropriate care may significantly impact TBI outcomes.

Future steps
With the inclusion of new predictors, our TBI prediction models must be externally validated. External validation tests a prediction model's performance on a dataset that was not used for model development, and thus assesses the model's generalizability. External validation is therefore a necessary step towards translating a prediction model into a clinically useful tool [47]. Despite the importance of this step, external validations remain unreported for many published TBI prognostic models [36]. To date, the CRASH and IMPACT models remain the most comprehensively validated TBI prognostic models constructed [34,48,49]. As our team continues to establish TBI registries in settings outside of Tanzania, we hope to have the data necessary to externally validate our models in a future study. It is also important to note that the most practical prediction model is one that achieves a desired level of performance with the fewest number of predictors. In a future study, with updated data from our TBI registry, we plan to use a feature selection approach to identify whether or not there is a subset of our time to care and/or clinical/sociodemographic variables that produces a more parsimonious model, and therefore a more practical model for implementation in resource limited settings.

Limitations
Although the inclusion of time to care variables improved our models' performance, we must consider the limitations of these variables. First, a patient's time of injury occurrence was self-reported. While our registry includes only patients who sustained a TBI no more than 24 hours prior to hospital arrival, the accuracy of our variable time from injury occurrence to hospital arrival may suffer from recall bias. Second, the registry used in this study only includes patients who sustained a TBI no more than 24 hours prior to hospital arrival. Consequently, the registry likely underrepresents the most severe cases of TBI (i.e. those who died of their injury before reaching the hospital). Given that these patients are expected to benefit the most from timely care, the selection bias inherent in our registry likely underestimates the predictive power of our time to care variables. Third, many patients in our registry did not receive a brain CT scan, fluids, or oxygen. Unfortunately, limitations in our dataset prevent us from knowing exactly why a patient did or did not receive a specific procedure. Having this information would help to both contextualize and generalize our findings. Fourth, we had no way of assessing a patient's need for TBI surgery, non-TBI surgery, or ICU admission and therefore could not assess the value of surgical or ICU need to model performance. Lastly, it is important to highlight that for most models the addition of time to care variables increased model performance metrics by no more than five percentage points. While such an increase indicates the predictive power of time to care variables, the clinical significance of a five-percentage point increase in AUC and other performance metrics can only be assessed through additional studies that measure the costs and benefits of using our models in clinical practice to facilitate TBI prognosis.

Conclusion
Our study assesses the value of need for care and time to care as predictors of in-hospital outcomes in machine learning based TBI prognostic models. We found that our predictors not only improve model performance, but also comprise a majority of the predictors that are most important to the predictive ability of each model. Given these results, and the simplicity with which need for care and time to care data can be collected, patient needs and patient delays may prove to be an easily accessible and valuable source of information when applying prediction models in low-income settings.

Fig 1. Comparison of receiver operating characteristic (ROC) curves across all models after the inclusion of time to care variables. The Bayesian generalized linear model had the largest area under the ROC curve (AUC). https://doi.org/10.1371/journal.pgph.0002156.g001

Fig 2. Twelve most important predictors in the top three performing models. Predictor importance is scaled to a range of 0 to 100 for easy comparison across all modeling techniques. Higher values indicate greater importance to a model's prediction ability. If a model is not represented by a predictor, it means the predictor was not one of the top twelve most important variables for that model. https://doi.org/10.1371/journal.pgph.0002156.g002